historical control data


Bayesian Prognostic Covariate Adjustment With Additive Mixture Priors

arXiv.org Machine Learning

Effective and rapid decision-making from randomized controlled trials (RCTs) requires unbiased and precise treatment effect inferences. Two strategies to address this requirement are to adjust for covariates that are highly correlated with the outcome, and to leverage historical control information via Bayes' theorem. We propose a new Bayesian prognostic covariate adjustment methodology, referred to as Bayesian PROCOVA, that combines these two strategies. Covariate adjustment in Bayesian PROCOVA is based on generative artificial intelligence (AI) algorithms that construct a digital twin generator (DTG) for RCT participants. The DTG is trained on historical control data and yields a digital twin (DT) probability distribution for each RCT participant's outcome under the control treatment. The expectation of the DT distribution, referred to as the prognostic score, defines the covariate for adjustment. Historical control information is leveraged via an additive mixture prior with two components: an informative prior probability distribution specified based on historical control data, and a weakly informative prior distribution. The mixture weight determines the extent to which posterior inferences are drawn from the informative component, versus the weakly informative component. This weight has a prior distribution as well, and so the entire additive mixture prior is completely pre-specifiable without involving any RCT information. We establish an efficient Gibbs algorithm for sampling from the posterior distribution, and derive closed-form expressions for the posterior mean and variance of the treatment effect parameter conditional on the weight, in Bayesian PROCOVA. We evaluate efficiency gains of Bayesian PROCOVA via its bias control and variance reduction compared to frequentist PROCOVA in simulation studies that encompass different discrepancies. These gains translate to smaller RCTs.
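The role of the mixture weight can be illustrated with a much simpler model than the one in the abstract. The sketch below is not the Bayesian PROCOVA Gibbs sampler. It assumes a single normal mean with known sampling variance, a fixed prior weight rather than a weight with its own prior, and hypothetical names throughout (`mixture_posterior`, `m_inf`, `v_weak`, and so on). It shows only how an additive two-component prior shifts posterior weight toward the informative component when the trial data agree with it, and away from it when they conflict.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_posterior(ybar, n, sigma2, w, m_inf, v_inf, m_weak, v_weak):
    """Posterior for a normal mean under a two-component normal mixture prior.

    Assumptions (illustrative, not from the paper): known sampling variance
    sigma2, fixed prior weight w on the informative component.
    Returns (posterior weight of informative component, posterior mean,
    posterior variance).
    """
    se2 = sigma2 / n  # sampling variance of the sample mean
    comps = []
    for weight, m, v in ((w, m_inf, v_inf), (1.0 - w, m_weak, v_weak)):
        # Marginal likelihood of ybar under this component: N(m, v + se2).
        marginal = normal_pdf(ybar, m, v + se2)
        # Conjugate normal update within the component.
        post_var = 1.0 / (1.0 / v + 1.0 / se2)
        post_mean = post_var * (m / v + ybar / se2)
        comps.append((weight * marginal, post_mean, post_var))

    total = sum(c[0] for c in comps)
    weights = [c[0] / total for c in comps]
    mean = sum(wt * c[1] for wt, c in zip(weights, comps))
    second_moment = sum(wt * (c[2] + c[1] ** 2) for wt, c in zip(weights, comps))
    return weights[0], mean, second_moment - mean ** 2


# Data consistent with the informative prior (centred at 0) versus data in conflict.
agree = mixture_posterior(0.0, 50, 1.0, 0.5, 0.0, 0.01, 0.0, 100.0)
conflict = mixture_posterior(2.0, 50, 1.0, 0.5, 0.0, 0.01, 0.0, 100.0)
print("agreement:", agree)
print("conflict: ", conflict)
```

In this toy setting the posterior weight on the informative component rises above its prior value when the observed mean matches the historical centre and collapses toward zero when it does not, which is the qualitative behaviour the abstract attributes to the mixture weight.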


Dynamic Borrowing Method for Historical Information Using a Frequentist Approach for Hybrid Control Design

arXiv.org Machine Learning

Information borrowing from historical data is gaining attention in clinical trials of rare and pediatric diseases, where statistical power may be insufficient for confirmation of efficacy if the sample size is small. Although Bayesian information borrowing methods are well established, test-then-pool and equivalence-based test-then-pool methods have recently been proposed as frequentist methods to determine whether historical data should be used for statistical hypothesis testing. Depending on the results of the hypothesis testing, historical data may not be usable. This paper proposes a dynamic borrowing method for historical information based on the similarity between current and historical data. In our proposed method of dynamic information borrowing, as in Bayesian dynamic borrowing, the amount of borrowing ranges from 0% to 100%. We propose two methods using the density function of the t-distribution and a logistic function as a similarity measure. We evaluate the performance of the proposed methods through Monte Carlo simulations. We demonstrate the usefulness of borrowing information by reanalyzing actual clinical trial data.
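A similarity-driven borrowing weight of the kind described above might look like the following sketch. The abstract does not give the functional forms, so everything here is an assumption made for illustration: the t-based weight is taken as the t density at the standardized difference divided by its value at zero, the logistic weight uses hypothetical `steepness` and `midpoint` parameters, and the weight is applied by discounting the historical sample size. None of these choices should be read as the paper's definitions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution (pure Python, moderate df only)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_similarity_weight(mean_cur, mean_hist, se_diff, df):
    """Assumed form: t density at the standardized difference, scaled to 1 at zero."""
    t_stat = (mean_cur - mean_hist) / se_diff
    return t_pdf(t_stat, df) / t_pdf(0.0, df)


def logistic_similarity_weight(mean_cur, mean_hist, se_diff,
                               steepness=2.0, midpoint=2.0):
    """Assumed form: decreasing logistic curve in the absolute standardized difference."""
    z = abs(mean_cur - mean_hist) / se_diff
    return 1.0 / (1.0 + math.exp(steepness * (z - midpoint)))


def borrowed_control_mean(mean_cur, n_cur, mean_hist, n_hist, weight):
    """Pool control means, discounting the historical sample size by the weight."""
    n_eff = weight * n_hist
    return (n_cur * mean_cur + n_eff * mean_hist) / (n_cur + n_eff)


# Hypothetical summary statistics for current and historical control arms.
w_t = t_similarity_weight(1.2, 1.0, 0.25, df=40)
w_l = logistic_similarity_weight(1.2, 1.0, 0.25)
print("t-based weight:", w_t, "logistic weight:", w_l)
print("borrowed mean:", borrowed_control_mean(1.2, 30, 1.0, 100, w_t))
```

A weight of 0 reproduces the current-data-only estimate and a weight of 1 reproduces full pooling, so the amount borrowed varies continuously between the two extremes instead of being switched on or off by a preliminary test.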


Optimal Experimental Design for Staggered Rollouts

arXiv.org Machine Learning

Experimentation has become an increasingly prevalent tool for guiding policy choices, firm decisions, and product innovation. A common hurdle in designing experiments is the lack of statistical power. In this paper, we study optimal multi-period experimental design under the constraint that the treatment cannot be easily removed once implemented; for example, a government or firm might implement treatment in different geographies at different times, where the treatment cannot be easily removed due to practical constraints. The design problem is to select which units to treat at which time, intending to test hypotheses about the effect of the treatment. When the potential outcome is a linear function of a unit effect, a time effect, and observed discrete covariates, we provide an analytically feasible solution to the design problem where the variance of the estimator for the treatment effect is at most 1+O(1/N^2) times the variance of the optimal design, where N is the number of units. This solution assigns units in a staggered treatment adoption pattern, where the proportion treated is a linear function of time. In the general setting where outcomes depend on latent covariates, we show that historical data can be utilized in the optimal design. We propose a data-driven local search algorithm with the minimax decision criterion to assign units to treatment times. We demonstrate that our approach improves upon benchmark experimental designs through synthetic experiments on real-world data sets from several domains, including healthcare, finance, and retail. Finally, we consider the case where the treatment effect changes with the time of treatment, showing that the optimal design treats a smaller fraction of units at the beginning and a greater share at the end.
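The staggered adoption pattern can be made concrete with a small sketch. It builds a unit-by-period assignment matrix in which treatment, once started, is never removed and the treated fraction follows a linear function of time. The function name `staggered_design` and the `intercept` and `slope` values in the usage example are illustrative assumptions, not the coefficients derived in the paper, and units are assigned in index order where a real design would randomize which units adopt at each time.

```python
def staggered_design(n_units, n_periods, intercept, slope):
    """Build an n_units x n_periods 0/1 matrix with irreversible treatment.

    The target treated fraction in period t (1-indexed) is
    intercept + slope * t, clipped to [0, 1]. Both coefficients are
    illustrative inputs, not values taken from the paper.
    """
    treated_counts = []
    for t in range(1, n_periods + 1):
        frac = min(1.0, max(0.0, intercept + slope * t))
        treated_counts.append(round(frac * n_units))

    # Treatment cannot be removed, so treated counts must never decrease.
    for i in range(1, n_periods):
        treated_counts[i] = max(treated_counts[i], treated_counts[i - 1])

    # Unit u is treated in period t if it is among the first treated_counts[t] units.
    return [
        [1 if unit < treated_counts[t] else 0 for t in range(n_periods)]
        for unit in range(n_units)
    ]


# Hypothetical schedule: 8 units, 4 periods, fraction treated rising linearly.
design = staggered_design(8, 4, intercept=-0.125, slope=0.25)
for row in design:
    print(row)
```

Each row is non-decreasing over time, which encodes the constraint that treatment cannot be withdrawn, and the column sums trace out the linear adoption curve.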